AITopics | stochastic shortest path

eeb69a3cb92300456b6a5f4162093851-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 19:35:31 GMT

We study the Stochastic Shortest Path (SSP) problem in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation ofthe problem, the agent has no prior knowledge about the costs and dynamics of the model. She repeatedly interacts with the model forK episodes, and has to minimize her regret. In this work we show that the minimax regret for this setting is eO( p (B2?+B?)|S||A|K)whereB?

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Nevada (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

367147f1755502d9bc6189f8e2c3005d-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 05:36:07 GMT

algorithm, eb-ssp, ssp, (13 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Add feedback

Minimax Regret for Stochastic Shortest Path

Neural Information Processing SystemsDec-25-2025, 05:00:18 GMT

We study the Stochastic Shortest Path (SSP) problem in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation of the problem, the agent has no prior knowledge about the costs and dynamics of the model. She repeatedly interacts with the model for $K$ episodes, and has to minimize her regret. In this work we show that the minimax regret for this setting is $\widetilde O(\sqrt{ (B_\star^2 + B_\star) |S| |A| K})$ where $B_\star$ is a bound on the expected cost of the optimal policy from any state, $S$ is the state space, and $A$ is the action space.

minimax regret, name change, stochastic shortest path, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.62)

Add feedback

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

Neural Information Processing SystemsDec-24-2025, 04:03:51 GMT

We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured. The key of our analysis is a new technique called implicit finite-horizon approximation, which approximates the SSP model by a finite-horizon counterpart only in the analysis without explicit implementation. Using this template, we develop two new algorithms: the first one is model-free (the first in the literature to our knowledge) and minimax optimal under strictly positive costs; the second one is model-based and minimax optimal even with zero-cost state-action pairs, matching the best existing result from [Tarbouriech et al., 2021b]. Importantly, both algorithms admit highly sparse updates, making them computationally more efficient than all existing algorithms. Moreover, both can be made completely parameter-free.

approximation and efficient optimal algorithm, implicit finite-horizon approximation, stochastic shortest path, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.89)

Add feedback

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Neural Information Processing SystemsDec-23-2025, 23:53:46 GMT

We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to induce an optimistic SSP problem whose associated value iteration scheme is guaranteed to converge. We prove that EB-SSP achieves the minimax regret rate $\widetilde{O}(B_{\star} \sqrt{S A K})$, where $K$ is the number of episodes, $S$ is the number of states, $A$ is the number of actions and $B_{\star}$ bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of $B_{\star}$, nor of $T_{\star}$, which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of $T_{\star}$ is available) where the regret only contains a logarithmic dependence on $T_{\star}$, thus yielding the first (nearly) horizon-free regret bound beyond the finite-horizon MDP setting.

horizon-free regret, name change, stochastic shortest path, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

dbb5180957513805ebeea787b8c66ac9-Paper-Conference.pdf

Neural Information Processing SystemsAug-19-2025, 09:59:05 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Minimax Regret for Stochastic Shortest Path

Neural Information Processing SystemsAug-18-2025, 16:29:08 GMT

We study the Stochastic Shortest Path (SSP) problem in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation of the problem, the agent has no prior knowledge about the costs and dynamics of the model. She repeatedly interacts with the model for K episodes, and has to minimize her regret.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > Nevada (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

Add feedback

367147f1755502d9bc6189f8e2c3005d-Paper.pdf

Neural Information Processing SystemsAug-14-2025, 03:54:02 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Add feedback

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Neural Information Processing SystemsMay-26-2025, 18:02:51 GMT

We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to induce an optimistic SSP problem whose associated value iteration scheme is guaranteed to converge. We prove that EB-SSP achieves the minimax regret rate \widetilde{O}(B_{\star} \sqrt{S A K}), where K is the number of episodes, S is the number of states, A is the number of actions and B_{\star} bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of B_{\star}, nor of T_{\star}, which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of T_{\star} is available) where the regret only contains a logarithmic dependence on T_{\star}, thus yielding the first (nearly) horizon-free regret bound beyond the finite-horizon MDP setting.

artificial intelligence, machine learning, stochastic shortest path, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.64)

Add feedback

Minimax Regret for Stochastic Shortest Path

Neural Information Processing SystemsJan-19-2025, 12:39:28 GMT

We study the Stochastic Shortest Path (SSP) problem in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation of the problem, the agent has no prior knowledge about the costs and dynamics of the model. She repeatedly interacts with the model for K episodes, and has to minimize her regret. In this work we show that the minimax regret for this setting is \widetilde O(\sqrt{ (B_\star 2 B_\star) S A K}) where B_\star is a bound on the expected cost of the optimal policy from any state, S is the state space, and A is the action space. Our algorithm is based on a novel reduction from SSP to finite-horizon MDPs.

minimax regret, sqrt, stochastic shortest path, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.64)
Information Technology > Artificial Intelligence > Machine Learning (0.60)

Add feedback

Collaborating Authors

stochastic shortest path

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

eeb69a3cb92300456b6a5f4162093851-Paper.pdf

367147f1755502d9bc6189f8e2c3005d-Paper.pdf

Minimax Regret for Stochastic Shortest Path

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

dbb5180957513805ebeea787b8c66ac9-Paper-Conference.pdf

Minimax Regret for Stochastic Shortest Path

367147f1755502d9bc6189f8e2c3005d-Paper.pdf

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Minimax Regret for Stochastic Shortest Path